THE NUMBER OF DISTINCT EIGENVALUES OF A MATRIX
AFTER PERTURBATION 

P. E. FARRELL* 


Abstract. We prove a new theorem relating the number of distinct eigenvalues of a matrix after 
perturbation to the prior number of distinct eigenvalues, the rank of the update, and the degree 
of nondiagonalizability of the matrix. In particular, a rank one update applied to a diagonalizable 
matrix can at most double the number of distinct eigenvalues. The theorem applies to both symmetric 
and nonsymmetric matrices and perturbations, of arbitrary magnitudes. As an application, we prove
that in exact arithmetic the number of Krylov iterations required to exactly solve a linear system 
involving a diagonalizable matrix can at most double after a rank one update.

1. Distinct eigenvalues after perturbation. The spectrum of a matrix after
perturbation is of interest in a wide variety of applications and has been studied
extensively in various particular cases, with most work focussing on the case of symmetric
rank one perturbations [16, 5, 3, 9]. More general results concern the Jordan form of
the matrix after “generic” rank one perturbations, i.e. the set of rank one
perturbations for which the analysis does not hold has Lebesgue measure zero [7, 15, 11, 10].
In this work we prove a new theorem regarding the number of distinct eigenvalues of
arbitrary matrices perturbed by updates of arbitrary rank.

Let $\Lambda(M)$ be the set of distinct eigenvalues of a matrix $M$. Let $m_a(M, \lambda)$ and
$m_g(M, \lambda)$ be the algebraic and geometric multiplicity of $\lambda$ as an eigenvalue of $M$,
respectively.

Definition 1.1 (Defectivity of an eigenvalue). The defectivity of an eigenvalue
$d(M, \lambda) \ge 0$ is the difference between its algebraic and geometric multiplicities,

$$d(M, \lambda) = m_a(M, \lambda) - m_g(M, \lambda). \tag{1.1}$$

Definition 1.2 (Defectivity of a matrix). The defectivity of a matrix $d(M)$ is
the sum of the defectivities of its eigenvalues:

$$d(M) = \sum_{\lambda \in \Lambda(M)} \bigl( m_a(M, \lambda) - m_g(M, \lambda) \bigr). \tag{1.2}$$

Recall that $m_a(M, \lambda) \ge m_g(M, \lambda)$ for all $M$ and $\lambda$. Thus, $d(M, \lambda) \ge 0$, and $d(M) \ge 0$.
Defectivity is a quantitative measure of nondiagonalizability: a matrix is
diagonalizable if and only if it has defectivity zero.

Remark 1. The defectivity of a matrix is clear from its Jordan form: it is the 
number of off-diagonal ones. 
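
As an illustration of Definitions 1.1 and 1.2 and of Remark 1, the following is a minimal Python sketch that evaluates the defectivity of a small matrix in exact rational arithmetic. The helper names (`rank`, `geometric_multiplicity`, `defectivity`) and the example matrix are our own choices for illustration, and the algebraic multiplicities are assumed to be supplied by hand rather than computed.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def geometric_multiplicity(M, lam):
    """m_g(M, lam) = n - rank(M - lam I)."""
    n = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

def defectivity(M, spectrum):
    """d(M), given a dict {eigenvalue: algebraic multiplicity} (assumed known)."""
    return sum(ma - geometric_multiplicity(M, lam) for lam, ma in spectrum.items())

# Jordan form with one 2x2 block for eigenvalue 2 and one 1x1 block for eigenvalue 3.
J = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
spectrum = {2: 2, 3: 1}  # algebraic multiplicities read off the diagonal

d = defectivity(J, spectrum)
off_diagonal_ones = sum(J[i][i + 1] for i in range(len(J) - 1))
print(d, off_diagonal_ones)
```

For this example the defectivity agrees with the count of off-diagonal ones in the Jordan form, as Remark 1 states.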

We now give the central theorem of this paper.

Theorem 1.3. Let $A, B \in \mathbb{C}^{n \times n}$. If $C = A + B$, then
$|\Lambda(C)| \le (\operatorname{rank}(B) + 1)\,|\Lambda(A)| + d(A)$.

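Proof. Clearly $|\Lambda(C)| = |\Lambda(C) \cap \Lambda(A)| + |\Lambda(C) \setminus \Lambda(A)|$, and the first term is
bounded by $|\Lambda(A)|$. We seek an upper bound for the quantity

$$\sum_{\substack{\lambda \in \Lambda(C) \\ \lambda \notin \Lambda(A)}} m_a(C, \lambda) \tag{1.3}$$

as this bounds the number of new eigenvalues that the perturbation can introduce.
(Every eigenvalue $\lambda$ of $C$ must have $m_a(C, \lambda) \ge 1$.) Since for $M \in \mathbb{C}^{n \times n}$

$$\sum_{\lambda \in \Lambda(M)} m_a(M, \lambda) = n, \tag{1.4}$$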

it follows that 

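$$\sum_{\substack{\lambda \in \Lambda(C) \\ \lambda \notin \Lambda(A)}} m_a(C, \lambda) + \sum_{\lambda \in \Lambda(A)} m_a(C, \lambda) = n, \tag{1.5}$$

with the convention that $m_a(C, \lambda) = 0 \iff \lambda \notin \Lambda(C)$. Thus, the upper bound on
the number of new eigenvalues introduced is maximized when

$$\sum_{\lambda \in \Lambda(A)} m_a(C, \lambda) \tag{1.6}$$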

is minimized. 

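Let $\lambda \in \Lambda(A)$. We first investigate $m_g(C, \lambda)$, the geometric multiplicity of $\lambda$
as an eigenvalue of the perturbed matrix $C$. Using the fact that $\operatorname{rank}(X + Y) \le
\operatorname{rank}(X) + \operatorname{rank}(Y)$, we derive a lower bound for $m_g(C, \lambda)$:

\begin{align}
& \operatorname{rank}(A + B - \lambda I) \le \operatorname{rank}(A - \lambda I) + \operatorname{rank}(B) \tag{1.7a} \\
\implies & n - \dim \ker(A + B - \lambda I) \le n - \dim \ker(A - \lambda I) + \operatorname{rank}(B) \tag{1.7b} \\
\implies & m_g(C, \lambda) \ge m_g(A, \lambda) - \operatorname{rank}(B). \tag{1.7c}
\end{align}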

Hence, the geometric multiplicity of an eigenvalue can at most decrease by r on 
perturbation by a rank-r operator. 

It therefore follows that 

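\begin{align}
\sum_{\lambda \in \Lambda(A)} m_a(C, \lambda) &\ge \sum_{\lambda \in \Lambda(A)} m_g(C, \lambda) \tag{1.8a} \\
&\ge \sum_{\lambda \in \Lambda(A)} \bigl( m_g(A, \lambda) - \operatorname{rank}(B) \bigr) && \text{(by (1.7c))} \tag{1.8b} \\
&= \sum_{\lambda \in \Lambda(A)} \bigl( m_a(A, \lambda) - \operatorname{rank}(B) - d(A, \lambda) \bigr) && \text{(by Definition 1.1)} \tag{1.8c} \\
&= n - \operatorname{rank}(B)\,|\Lambda(A)| - d(A). \tag{1.8d}
\end{align}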

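The maximal number of new eigenvalues is achieved when (1.8) is an equality,
and

$$\sum_{\substack{\lambda \in \Lambda(C) \\ \lambda \notin \Lambda(A)}} m_a(C, \lambda) = n - \bigl( n - \operatorname{rank}(B)\,|\Lambda(A)| - d(A) \bigr) = \operatorname{rank}(B)\,|\Lambda(A)| + d(A). \tag{1.9}$$

Hence

\begin{align}
|\Lambda(C)| &= |\Lambda(C) \cap \Lambda(A)| + |\Lambda(C) \setminus \Lambda(A)| \notag \\
&\le |\Lambda(A)| + \operatorname{rank}(B)\,|\Lambda(A)| + d(A) = (\operatorname{rank}(B) + 1)\,|\Lambda(A)| + d(A). \tag{1.10}
\end{align}

□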

Corollary 1.4. Let $A$ be diagonalizable (i.e., $d(A) = 0$) and let $B$ have rank
one. If $C = A + B$, then $|\Lambda(C)| \le 2\,|\Lambda(A)|$.
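
The bound of Corollary 1.4 can be checked on a small example. The sketch below is an illustration under our own assumptions, not part of the original argument: it counts distinct eigenvalues exactly as $\deg p - \deg \gcd(p, p')$, where $p$ is the characteristic polynomial computed with the Faddeev-LeVerrier recurrence in rational arithmetic. The matrices $A = \operatorname{diag}(1, 1, 2, 2)$ and $B = u v^T$ and all function names are hypothetical choices.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def charpoly(A):
    """Coefficients [c_0, ..., c_n] of det(x I - A) via Faddeev-LeVerrier."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    c = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c

def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, lowest degree first)."""
    a = a[:]
    while len(a) >= len(b):
        f = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, bc in enumerate(b):
            a[shift + i] -= f * bc
        a.pop()                      # leading coefficient is now exactly zero
        while a and a[-1] == 0:      # strip any further leading zeros
            a.pop()
    return a

def poly_gcd(a, b):
    while b:
        a, b = b, poly_rem(a, b)
    return a

def count_distinct_eigenvalues(A):
    """Number of distinct roots of the characteristic polynomial over C."""
    p = charpoly(A)
    dp = [k * p[k] for k in range(1, len(p))]
    g = poly_gcd(p, dp)
    return (len(p) - 1) - (len(g) - 1)

# A is diagonalizable with two distinct eigenvalues; B = u v^T has rank one.
A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]]
u = [1, 1, 1, 1]
v = [1, 2, 3, 4]
C = [[A[i][j] + u[i] * v[j] for j in range(4)] for i in range(4)]

n_A = count_distinct_eigenvalues(A)
n_C = count_distinct_eigenvalues(C)
print(n_A, n_C)
```

In this example the characteristic polynomial of $C$ factors as $(x - 1)(x - 2)(x^2 - 13x + 15)$, so the bound $|\Lambda(C)| \le 2\,|\Lambda(A)|$ is attained.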

2. Krylov iterations after a rank one update. Consider the linear systems
$Ax = b$ and $Cy = d$. If $A$ is diagonalizable, then its minimal polynomial degree
$\operatorname{mpd}(A) = |\Lambda(A)|$, and an optimal Krylov method (GMRES [14], MINRES [13], or
CG [6], if applicable) will compute $x$ exactly in at most that number of iterations. (Here,
and henceforth, exact arithmetic is assumed.)

Theorem 2.1. Consider the linear systems $Ax = b$ and $Cy = d$. Let $A$ be
diagonalizable, and let $B$ have rank one. If $C = A + B$, then $y$ can be computed
exactly with an optimal Krylov method in at most double the number of iterations
required for $x$.

Proof. If $C$ is diagonalizable, then $\operatorname{mpd}(C) = |\Lambda(C)| \le 2\,|\Lambda(A)| = 2 \operatorname{mpd}(A)$,
i.e. the number of distinct eigenvalues bounds the number of Krylov iterations required
to solve the perturbed system.

If $C$ is not diagonalizable, we know from (1.7c) that the number of Jordan blocks
associated with an eigenvalue $\lambda \in \Lambda(A)$ can decrease by at most $1 = \operatorname{rank}(B)$ in
$C$. Since by diagonalizability of $A$ all its Jordan blocks are of size $1 \times 1$, the largest
Jordan block of $C$ can be at most of size $|\Lambda(A)| \times |\Lambda(A)|$, which can occur when all
eigenvalues of $A$ lose exactly one Jordan block. It is straightforward to calculate that
with any arrangement of new Jordan blocks of $C$ with sizes adding to $|\Lambda(A)|$, the
number of Krylov iterations required to compute $y$ is bounded by twice that of $x$. □
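
Theorem 2.1 can be illustrated numerically under the following assumptions, which are ours and not stated in the text: the matrices are nonsingular, the initial guess is zero, and the iteration count of an optimal Krylov method in exact arithmetic is taken to be the grade of the right-hand side, i.e. the smallest $k$ for which $b, Ab, \dots, A^k b$ are linearly dependent. The sketch below computes this grade in rational arithmetic; the function names, the matrices and the right-hand side are hypothetical, and the same right-hand side is used for both systems for simplicity.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def krylov_grade(A, b):
    """Smallest k such that b, Ab, ..., A^k b are linearly dependent."""
    vectors = []
    v = [Fraction(x) for x in b]
    while True:
        vectors.append(v)
        if rank(vectors) < len(vectors):
            return len(vectors) - 1
        v = matvec(A, v)

# A is diagonalizable with two distinct eigenvalues; C = A + u v^T.
A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]]
u = [1, 1, 1, 1]
v = [1, 2, 3, 4]
C = [[A[i][j] + u[i] * v[j] for j in range(4)] for i in range(4)]
b = [1, 0, 1, 0]

k_A = krylov_grade(A, b)
k_C = krylov_grade(C, b)
print(k_A, k_C)
```

For this right-hand side the Krylov space of $A$ stops growing after two vectors and that of $C$ after four, consistent with the factor of two in Theorem 2.1.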

3. Application: Schur complement preconditioners and deflation. Theorem 1.3
is mainly of interest in situations where $|\Lambda(A)|$ is expected to be small. Such
a situation arises in the application of preconditioners based on Schur complements.

Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be the (discretized) residual of a nonlinear problem

$$F(u) = 0 \tag{3.1}$$

with block-structured Jacobian

$$J = \begin{pmatrix} X & Y \\ Z & 0 \end{pmatrix}, \tag{3.2}$$

with X invertible. This structure arises in many problems, including the Stokes and 
Navier-Stokes equations, and in equality-constrained optimization [1]. Linear systems 
involving J are typically solved with Schur complement preconditioners. Define 


$$P = \begin{pmatrix} X & 0 \\ 0 & -S \end{pmatrix}, \tag{3.3}$$


where the Schur complement $S = -Z X^{-1} Y$. If $P$ is used as a preconditioner, then the
preconditioned operator $P^{-1} J$ is diagonalizable and has exactly three distinct
eigenvalues (with exact inner solves for the application of $P^{-1}$) [12]. Similar results hold
for more general block-structured Jacobians and preconditioners based on the Schur
complement: the preconditioned operator has a small number of distinct eigenvalues [8].

Suppose (3.1) supports multiple solutions. One approach to compute them is 
to initialize Newton’s method from many different initial guesses, hoping to start in 
different basins of convergence. A highly effective alternative is to deflate known 
solutions [4]. Suppose one solution $u_1^*$ of (3.1) has been computed from an initial
guess $u_0$ and additional solutions are sought. We construct a modified residual

$$G(u) = M(u; u_1^*) F(u), \tag{3.4}$$

via the application of a deflation operator $M : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ to the residual $F$. This
deflation operator guarantees two properties. The first is the preservation of solutions
of $F$, i.e. for $u \ne u_1^*$, $G(u) = 0 \iff F(u) = 0$. The second is that Newton’s method
(or other rootfinding algorithms) applied to $G$ will not discover $u_1^*$ again, as

$$\liminf_{u \to u_1^*} \|G(u)\| > 0, \tag{3.5}$$


i.e. along any sequence converging to the known root, its existence is masked by the
nonconvergence of the deflated residual to zero. ($M$ achieves this by introducing a
pole of the appropriate strength at the known solution.) Thus, if Newton’s method
applied to $G$ converges from $u_0$, it will converge to another solution $u_2^* \ne u_1^*$. A
typical deflation operator is


$$M(u; u_1^*) = \frac{1}{\|u - u_1^*\|^p} + 1, \tag{3.6}$$


where p controls the strength of the pole introduced. 

The process can then be repeated until no more solutions are found from $u_0$ in a
fixed number of Newton iterations. Several solutions can be deflated with an operator
$M : \mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}$ via


$$M(u; u_1^*, u_2^*, \dots, u_k^*) = \prod_{i=1}^{k} \left( \frac{1}{\|u - u_i^*\|^p} + 1 \right). \tag{3.7}$$


For full details, see Brown and Gearhart [2] and Farrell et al. [4]. 

The Jacobian $\mathcal{J}$ of the deflated problem (3.4) is a rank one update of a scaling
of the Jacobian $J$ of the original problem (3.1), regardless of the number of solutions
deflated:

$$\mathcal{J} = M J + F E^T, \tag{3.8}$$

where $E = M' \in \mathbb{R}^n$ is the derivative of $M$ with respect to $u$. Hence, the
preconditioned deflated Jacobian is also a rank one update of the preconditioned original Jacobian,

$$C = P^{-1} \mathcal{J} = M P^{-1} J + (P^{-1} F) E^T = A + B, \tag{3.9}$$

and Theorem 2.1 guarantees that the solutions of linear systems involving the deflated 
Jacobian can be computed exactly in no more than twice the number of Krylov 
iterations required for the undeflated Jacobian. 
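
The product-rule structure of (3.8) can be checked with finite differences. The sketch below is only an illustration under stated assumptions: the toy residual `F`, the evaluation point, the pole strength `P = 2`, and the shifted form $1/\|u - u_1^*\|^p + 1$ of the deflation operator are all our own choices. It compares a central-difference Jacobian of $G = M F$ with the formula $M J + F E^T$.

```python
import math

P = 2                    # pole strength (assumed value)
U_STAR = (1.0, 1.0)      # known root of the toy residual below

def dist(u):
    return math.sqrt(sum((ui - si) ** 2 for ui, si in zip(u, U_STAR)))

def F(u):
    """Toy residual (hypothetical) with roots (1, 1) and (-1, -1)."""
    return [u[0] ** 2 - 1.0, u[1] - u[0]]

def J(u):
    """Analytic Jacobian of F."""
    return [[2.0 * u[0], 0.0], [-1.0, 1.0]]

def M(u):
    """Assumed deflation operator 1 / ||u - u*||^p + 1."""
    return 1.0 / dist(u) ** P + 1.0

def E(u):
    """Gradient of M with respect to u."""
    r = dist(u)
    return [-P * (ui - si) / r ** (P + 2) for ui, si in zip(u, U_STAR)]

def G(u):
    """Deflated residual G(u) = M(u) F(u)."""
    return [M(u) * fi for fi in F(u)]

def deflated_jacobian(u):
    """M J + F E^T, the formula under test."""
    m, f, e, j = M(u), F(u), E(u), J(u)
    return [[m * j[i][k] + f[i] * e[k] for k in range(2)] for i in range(2)]

def fd_jacobian(func, u, h=1e-6):
    """Central-difference Jacobian; entry [i][k] approximates d func_i / d u_k."""
    n = len(u)
    cols = []
    for k in range(n):
        up, um = list(u), list(u)
        up[k] += h
        um[k] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(func(up), func(um))])
    return [[cols[k][i] for k in range(n)] for i in range(n)]

u = [0.5, -0.3]
exact = deflated_jacobian(u)
approx = fd_jacobian(G, u)
max_err = max(abs(exact[i][k] - approx[i][k]) for i in range(2) for k in range(2))

# The update F E^T is an outer product, so its 2x2 determinant vanishes.
f, e = F(u), E(u)
update_det = (f[0] * e[0]) * (f[1] * e[1]) - (f[0] * e[1]) * (f[1] * e[0])
print(max_err, update_det)
```

The update term is an outer product of two vectors whatever the number of deflated solutions, since $M$ remains scalar-valued; only $M$ and its gradient $E$ change.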